Dynamical and correlation properties of the Internet 
The description of the Internet topology is an important open problem, recently tackled with 
the introduction of scale-free networks. In this paper we focus on the topological and dynamical 
properties of real Internet maps over a three-year time interval. We study higher order correlation 
functions as well as the dynamics of several quantities. We find that the Internet is characterized by 
non-trivial correlations among nodes and different dynamical regimes. We point out the importance 
of node hierarchy and aging in the Internet structure and growth. Our results provide hints towards 
the realistic modeling of the Internet evolution. 
Complex networks play an important role in the understanding 
of many natural systems 0, . A network 
is a set of nodes and links, representing individuals and 
the interactions among them, respectively. Despite this 
simple definition, growing networks can exhibit a high 
degree of complexity, due to the inherent wiring entan- 
glement occurring during their growth. The Internet is 
a prime example of a growing network with technological 
and economic relevance; however, the collection 
of router-level maps of the Internet has received the at- 
tention of the research community only very recently 
H 0. D . The statistical analyses performed so far have revealed 
that the Internet exhibits several non-trivial topological 
properties (wiring redundancy, clustering, etc.). 
Among them, the presence of a power-law connectivity 
distribution ^| makes the Internet an example of the 
recently identified class of scale-free networks ||. 

In this Letter, we focus on the dynamical properties of 
the Internet. We shall consider the evolution of real In- 
ternet maps from 1997 to 2000, collected by the National 
Laboratory for Applied Network Research (NLANR) j|. 
In particular, we will inspect the correlation properties 
of nodes' connectivity, as well as the time behavior of 
several quantities related to the growth dynamics of new 
nodes. Our analysis shows a dynamical behavior with dif- 
ferent growth regimes depending on the node's age and 
connectivity. The analysis points out two distinct wiring 
processes; the first one concerns newly added nodes, 
while the second is related to already existing nodes in- 
creasing their interconnections. A feature introduced in 
the present work refers to the Internet hierarchical struc- 
ture, reflected in a non-trivial scale-free connectivity cor- 
relation function. Finally, we discuss recent models for 
the generation of scale-free networks in the light of the 
present analysis of real Internet maps. The results presented 
in this Letter could help in developing more accurate 
models of the Internet. 

Several Internet mapping projects are currently devoted 
to obtaining high-quality router-level maps of the Internet. 
In most cases, the map is constructed by using 
a hop-limited probe (such as the UNIX traceroute tool) 
from a single location in the network. In this case the 
result is a "directed" map as seen from a specific point 
on the Internet [5]. This approach does not correspond 
to a complete map of the Internet because cross-links and 
other technical problems (such as multiple IP aliases) are 
not considered. Heuristic methods to take these problems 
into account have been proposed. However, their reliability 
and the corresponding completeness of maps constructed 
in this way remain unclear. A different representation 
of the Internet is obtained by mapping the autonomous 
systems (AS) topology. Each AS number approximately 
maps to an Internet Service Provider (ISP) and their 
links are inter-ISP connections. In this case it is possible 
to collect data from several probing stations to obtain 
complete interconnectivity maps ^ Qj. In particular, the 
NLANR project has been collecting data since Nov. 1997, and 
it provides topological as well as dynamical information 
on a consistent subset of the Internet. The first Nov. 
1997 map contains 3180 AS, and it has grown in time 
until the Dec. 1999 measurement, consisting of 6374 AS. 
In the following we will consider the graph whose nodes 
represent AS and whose links represent the connections 
between AS. 

In dealing with the Internet as an evolving network, it 
is important to discern whether or not it has reached 
a stationary state whose average properties are time- 
independent. As a first step, we have analyzed the be- 
havior in time of several average quantities such as the 
connectivity (k), the clustering coefficient (C) and the 
average minimum path distance (d) of the network flio| |. 
The first two quantities (see Table I) show a very slow 
tendency to increase in time, while the average mini- 
mum path distance is slowly decreasing with time. A 
more clear-cut characterization of the topological prop- 
erties of the network is given by the connectivity distri- 
bution, P(k). In Fig. 1 we show the probability P(k) 
that a given node has k links to other nodes. We report 
the distribution for snapshots of the Internet at different 
times. In all cases, we found a clear power law behavior 
$P(k) \sim k^{-\gamma}$ with $\gamma = 2.2 \pm 0.1$. The distribution cut-off 
is fixed by the maximum connectivity of the system and 
is related to the overall size of the Internet map. On the 
other hand, the power law exponent $\gamma$ seems to be independent 
of time and in good agreement with previous 
measurements Q . This evidence seems to point out that 
the Internet's topological properties have already settled 
to a rather well-defined stationary state. 
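
The average quantities and the cumulative connectivity distribution discussed above can be computed from an edge list along the following lines. This is only a minimal sketch, not the procedure used for the NLANR maps: the function names and the toy edge list are hypothetical, and the clustering coefficient is assumed to use the standard normalization k_i(k_i - 1)/2 for the maximum number of links among the neighbors of node i.

```python
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from (a, b) pairs; self-loops are ignored."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def average_connectivity(adj):
    """Mean number of links per node, <k>."""
    return sum(len(nb) for nb in adj.values()) / len(adj)


def clustering_coefficient(adj):
    """<C>: average over nodes of (links among neighbors) / (k(k-1)/2)."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue  # assumption: C_i = 0 for nodes with fewer than 2 links
        # every link among neighbors is seen from both ends, hence // 2
        links = sum(1 for u in nb for v in adj[u] if v in nb) // 2
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def cumulative_degree_distribution(adj):
    """Fraction of nodes with connectivity >= k, for each observed k."""
    counts = Counter(len(nb) for nb in adj.values())
    n = len(adj)
    remaining = n
    result = {}
    for k in sorted(counts):
        result[k] = remaining / n
        remaining -= counts[k]
    return result


# Hypothetical toy graph: a triangle (1, 2, 3) with a pendant node 4.
toy_edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
toy_adj = build_adjacency(toy_edges)
```

For a power law P(k) ~ k^(-γ), the cumulative distribution decays as k^(1-γ), which is why a slope of -1.2 in the cumulative plot corresponds to γ = 2.2.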

Initially, the modeling of Internet considered algo- 
rithms based on its static topological properties p| . 
However, since the Internet is the natural outcome of 
a complex growth process, the understanding of the dy- 
namical processes leading to its present structure must 
be considered as a fundamental goal. From this perspective, 
the Barabási-Albert (BA) model, Ref. f|, H), can 
be considered as a major step forward in the understand- 
ing of evolving networks. Underlying the BA model is the 
preferential attachment rule j^]; i.e., new nodes will link 
with higher probability to nodes with an already large 
connectivity. This feature is quantitatively accounted for 
by postulating that the probability $\Pi(k_i)$ of a new link 
attaching to an old node with connectivity $k_i$ is linearly 
proportional to $k_i$: $\Pi(k_i) \sim k_i$. This is an intuitive 
feature of the Internet growth where large provider 
hubs are more likely to establish connections than smaller 
providers. The BA model has subsequently been modified 
with the introduction of several ingredients in order to 
account for connectivity distributions with $2 < \gamma < 3$ 
[13, 14], local geographical factors [15], wiring among 
existing nodes, and age effects [17]. While all these 
models reproduce the scale-free behavior of the connectivity 
distribution, it is worth inspecting the Internet's 
topology more deeply in order to identify a few features 
that discriminate among the dynamical processes at the basis of 
the Internet growth. 
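
The linear preferential attachment rule described above can be illustrated with a short growth simulation. This is a sketch of a BA-style process under stated assumptions, not a reimplementation of any of the cited models: the function name, the fully connected seed core, and the parameters `n`, `m` and `seed` are illustrative choices.

```python
import random


def barabasi_albert_edges(n, m, seed=0):
    """Grow a network of n nodes; each new node attaches m links to
    distinct existing nodes chosen with probability proportional to
    their current connectivity k."""
    rng = random.Random(seed)
    # assumption: start from a fully connected core of m + 1 nodes
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # each node appears in `stubs` once per link end, so a uniform draw
    # from `stubs` selects a node with probability proportional to k
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


ba_edges = barabasi_albert_edges(200, 2)
```

Keeping a list of link ends avoids recomputing the normalization of the attachment probability at every step, at the cost of memory proportional to the number of links.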

A first step in a more detailed characterization of the 
Internet concerns the exploration of the connectivity correlations. 
This factor is best represented by the conditional 
probability $P_c(k'|k)$ that a link belonging to a node 




FIG. 1: Cumulative connectivity distribution for the 1997, 
1998, and 1999 snapshots of the Internet. The power law 
behavior is characterized by a slope -1.2, which yields a 
connectivity exponent $\gamma = 2.2$. 
FIG. 2: Average connectivity $\langle k_{nn} \rangle$ of the nearest neighbors of 
a node as a function of its connectivity $k$ for the 1998 snapshot 
of the Internet, the generalized BA model with $\gamma = 2.2$, Ref. [13], 
and the fitness model, Ref. [18]. The full line has a slope -0.5. 
The scattered results for very large $k$ are due to statistical 
fluctuations. 



Year                   1997      1998      1999 
$\langle k \rangle$    3.47(4)   3.62(5)   3.82(6) 
$\langle C \rangle$    0.18(1)   0.21(2)   0.24(1) 
$\langle d \rangle$    3.77(1)   3.76(2)   3.72(1) 



TABLE I: Average properties for three different years. $\langle k \rangle$ is 
the average connectivity; $\langle d \rangle$ is the minimum path distance 
$d_{ij}$ averaged over every pair of nodes; $\langle C \rangle$ is the clustering 
coefficient $C_i$ averaged over all nodes $i$, where $C_i$ is defined 
as the ratio between the number of links between the neighbors 
of $i$ and its maximum possible value $k_i(k_i - 1)/2$. Figures in 
parentheses indicate the statistical uncertainty from averaging 
the values of the corresponding months in each year. 



with connectivity $k$ points to a node with connectivity 
$k'$. If this conditional probability is independent of 
$k$, we are in the presence of a topology without any correlation 
among the nodes' connectivity. In this case, 
$P_c(k'|k) = P_c(k') \sim k' P(k')$, in view of the fact that any 
link points to nodes with a probability proportional to 
their connectivity. On the contrary, an explicit dependence 
on $k$ is a signature of non-trivial correlations among 
the nodes' connectivity, and of the possible presence of a 
hierarchical structure in the network topology. A direct 
measurement of the $P_c(k'|k)$ function is a rather complex 
task due to large statistical fluctuations. Clearer 
indications can be extracted by studying the quantity 
$\langle k_{nn} \rangle = \sum_{k'} k' P_c(k'|k)$; i.e., the nearest-neighbor average 
connectivity of nodes with connectivity $k$. In Fig. 2, 
we show the results obtained for the Internet map of 1998, 
which exhibit a clear power-law dependence on 
the connectivity degree, $\langle k_{nn} \rangle \sim k^{-\nu}$, with $\nu \simeq 0.5$. This 
result clearly implies the existence of non-trivial correlation 
properties for the Internet. The primary known 
structural difference between Internet nodes is the dis- 
tinction between stub and transit domains. Nodes in stub 
domains have links that go only through the domain it- 
self. Stub domains, on the other hand, are connected 
via a gateway node to transit domains that, on the con- 
trary, are fairly well interconnected via many paths. In 
other words, there is a hierarchy imposed on nodes that 
is very likely at the basis of the above correlation properties. 
As instructive examples, we report in Fig. 2 the 
average nearest-neighbor connectivity for the generalized 
BA model with $\gamma = 2.2$ [13] and the fitness model described 
in Ref. [18], with $\gamma = 2.25$, for networks of the 
same size as the Internet snapshot considered. While 
in the first case we do not observe any noticeable structure 
with respect to the connectivity $k$, in the latter we 
obtain a power-law dependence similar to the experimental 
findings. A general analytic study of connectivity 
correlations in growing network models can be found in 
Ref. [19]. A detailed discussion of different models is beyond 
the scope of the present paper; however, it is worth 
noting that a $k$-structure in correlation functions, as 
probed by the quantity $\langle k_{nn} \rangle$, does not arise in all growing 
network models. 
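
The nearest-neighbor average connectivity used in this analysis can be estimated from an edge list as sketched below. The function name and the star-graph example are hypothetical. Averaging the mean neighbor connectivity over all nodes of connectivity k is equivalent to the sum over k' of k' P_c(k'|k), with P_c estimated from the links leaving those nodes.

```python
from collections import defaultdict


def average_neighbor_connectivity(edges):
    """Return {k: <k_nn>} for every connectivity k present in the graph."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # self-loops are ignored
            adj[a].add(b)
            adj[b].add(a)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for neighbors in adj.values():
        k = len(neighbors)
        # mean connectivity of this node's neighbors
        sums[k] += sum(len(adj[v]) for v in neighbors) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}


# Hypothetical star graph: hub 0 connected to four leaves.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
knn = average_neighbor_connectivity(star)
```

In the star graph the hub sees only poorly connected neighbors while each leaf sees the hub, so the estimate decreases with k; this is the qualitative signature of the hierarchical (stub versus transit) organization discussed in the text.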

In order to inspect the Internet dynamics, we focus 
our attention on the addition of new nodes and links 
into the maps. In the three-year range considered, we 
have kept track of the number of links $\ell_{new}$ appearing 
between a newly introduced node and an already existing 
node. We have also monitored the rate of appearance of 
links $\ell_{old}$ between already existing nodes. In Table II we 
see that the creation of new links is governed by these 
two processes at the same time. Specifically, the largest 
contribution to the growth is given by the appearance of 
links between already existing nodes. This clearly points 
out that the Internet growth is strongly driven by the 
need for redundant wiring and an increased need for 
available bandwidth for data transmission. 
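
The two wiring processes can be separated by comparing consecutive snapshots, roughly as follows. This is an illustrative sketch rather than the bookkeeping actually used for the NLANR data: the function name and the example snapshots are hypothetical, and a node is assumed to "exist" in the earlier snapshot only if it has at least one link there.

```python
def classify_new_links(old_edges, new_edges):
    """Count links present only in the later snapshot, split by whether
    they touch a newly introduced node."""
    old_links = {frozenset(e) for e in old_edges}
    old_nodes = {v for e in old_edges for v in e}
    counts = {"new_to_old": 0, "old_to_old": 0, "new_to_new": 0}
    for link in {frozenset(e) for e in new_edges} - old_links:
        if len(link) != 2:
            continue  # skip self-loops
        n_new = sum(1 for v in link if v not in old_nodes)
        if n_new == 0:
            counts["old_to_old"] += 1   # wiring among existing nodes
        elif n_new == 1:
            counts["new_to_old"] += 1   # new node attaching to an old one
        else:
            counts["new_to_new"] += 1   # not discussed in the text
    return counts


# Hypothetical snapshots one month apart.
snapshot_a = [(1, 2), (2, 3)]
snapshot_b = [(1, 2), (2, 3), (1, 3), (3, 4), (5, 6)]
link_counts = classify_new_links(snapshot_a, snapshot_b)
```

Links that disappear between snapshots are ignored here; a fuller treatment would track them as well.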

A customarily measured quantity in the case of growing 
networks is the average connectivity $\langle k_i(t) \rangle$ of new 
nodes as a function of their age t. In Refs. || [ll| it 
is shown that $\langle k_i(t) \rangle$ is a scaling function of both $t$ and 
the absolute time of birth of the node, $t_0$. We thus consider 
the total number of nodes born within a small 
observation window $\Delta t_0$, such that $t_0 \simeq \mathrm{const.}$ with respect 
to the absolute time scale that is the Internet lifetime. 
For these nodes, we measure the average connectivity 
as a function of the time $t$ elapsed since their birth. 
The data for two different time windows are reported in 
FIG. 3: Average connectivity of nodes born within a small 
time window $\Delta t_0$, after a time $t$ elapsed since their appearance. 
Time $t$ is measured in days. As a comparison we report 
the lines corresponding to $t^{0.1}$ and $t^{0.5}$. 



Fig. 3, where it is possible to distinguish two different 
dynamical regimes: At early times, the connectivity is 
nearly constant with a very slow increase ($\langle k_i(t) \rangle \sim t^{0.1}$). 
Later on, the behavior approaches a power law growth 
$\langle k_i(t) \rangle \sim t^{0.5}$. While exponent estimates are affected by 
noise and limited time window effects, the crossover between 
two distinct dynamical regimes is compatible with 
the general aging form obtained in Ref. [19]. In particular, 
both the generalized BA model [13] and the fitness 
model [18] present aging effects similar to those obtained 
in real data. A more detailed comparison would require 
a quantitative knowledge of the parameters to be used in 
the models and will be reported elsewhere. 
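
The cohort measurement described above (average connectivity versus age for nodes born within a narrow window) might be organized as in the sketch below. The data layout is an assumption made for illustration: `birth` maps each node to its day of first appearance, and `degree_on_day` maps each node to its observed connectivity on the days it was measured.

```python
from collections import defaultdict


def cohort_connectivity(birth, degree_on_day, t0_start, t0_end):
    """Average connectivity versus age t (in days) for nodes whose
    birth day t0 satisfies t0_start <= t0 < t0_end."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, t0 in birth.items():
        if not (t0_start <= t0 < t0_end):
            continue  # node born outside the observation window
        for day, k in degree_on_day[node].items():
            age = day - t0
            if age >= 0:
                sums[age] += k
                counts[age] += 1
    return {age: sums[age] / counts[age] for age in sorted(sums)}


# Hypothetical measurements for three nodes.
birth = {"a": 0, "b": 1, "c": 50}
degree_on_day = {
    "a": {0: 1, 10: 2},
    "b": {1: 1, 11: 4},
    "c": {50: 1},
}
cohort = cohort_connectivity(birth, degree_on_day, 0, 5)
```

Plotting the result on doubly logarithmic axes would then allow the two regimes to be compared against reference power laws.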

A basic issue in the modeling of growing networks concerns 
the preferential attachment hypothesis ||]. Generalizing 
the BA model algorithm, it is possible to define 
models in which the rate $\Pi(k)$ with which a node with 
$k$ links receives new links is proportional to $k^{\alpha}$. The 
inspection of the exact value of $\alpha$ in real networks is an 
important issue since the connectivity properties strongly 
depend on this exponent [14, 20]. Here we use a simple 
recipe that allows one to extract the value of $\alpha$ by studying 
the appearance of new links. We focus on links emanating 
from newly appeared nodes in different time windows 



Year                          1997      1998      1999 
$\ell_{new}$                  183(9)    170(8)    231(11) 
$\ell_{old}$                  546(35)   350(9)    450(29) 
$\ell_{new}/\ell_{old}$       0.34(2)   0.48(2)   0.53(3) 

TABLE II: Monthly rate of new links connecting existing 
nodes to new ($\ell_{new}$) and old ($\ell_{old}$) nodes. 



[Fig. 4 legend: links from new nodes; links from old nodes] 

PB97-0693. We thank M.-C. Miguel, Y. Moreno-Vega, 
and R. V. Sole for helpful comments and discussions. 






FIG. 4: Cumulative frequency of links emanating from new 
and existing nodes that attach to nodes with connectivity $k$. 
The straight line corresponds to a slope -0.2. The flat tail 
originates from the poor statistics at very high $k$ values. 



quency $\mu(k)$ of links that connect to nodes with connectivity 
$k$. By using the preferential attachment hypothesis, 
this effective probability is $\mu(k) \sim k^{\alpha} P(k)$. Since 
we know that $P(k) \sim k^{-\gamma}$, we expect to find a power law 
behavior $\mu(k) \sim k^{\alpha - \gamma}$ for the frequency. In Fig. 4, we 
report the obtained results, which show a clear algebraic 
dependence $\mu(k) \sim k^{-1.2}$. By using the independently 
obtained value $\gamma = 2.2$ we find a preferential attachment 
exponent $\alpha \simeq 1.0$, in good agreement with the result 
obtained with a different analysis in Ref. [20]. We 
performed a similar analysis also for links emanating from 
existing nodes, recovering the same form of preferential 
attachment (see Fig. 4). 
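
The arithmetic linking the slope of the cumulative plot to the attachment exponent can be checked with a short calculation. This is a sketch under stated assumptions: the helper names are hypothetical, the slope is obtained by an ordinary least-squares fit on doubly logarithmic axes, and the cumulative frequency of a power law k^(α - γ) is taken to scale as k^(α - γ + 1).

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log10(y) versus log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


def attachment_exponent(cumulative_slope, gamma):
    """alpha from the cumulative-frequency slope: the cumulative of
    k^(alpha - gamma) scales as k^(alpha - gamma + 1)."""
    return cumulative_slope - 1.0 + gamma


# Synthetic points lying exactly on a line of slope -0.2.
ks = [1, 10, 100, 1000]
cumulative = [k ** -0.2 for k in ks]
slope = loglog_slope(ks, cumulative)
alpha = attachment_exponent(slope, 2.2)
```

With the slope -0.2 of Fig. 4 and γ = 2.2 this gives α close to 1, i.e. linear preferential attachment.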

In summary, we have shown that the Internet map ex- 
hibits a stationary scale-free topology, characterized by 
non-trivial connectivity correlations. An investigation of 
the Internet's dynamics confirms the presence of a prefer- 
ential attachment behaving linearly with the nodes' con- 
nectivity and identifies two different dynamical regimes 
during the nodes' evolution. We point out that very likely 
several other factors, such as the nodes' hierarchy re- 